Abstract

We consider in this paper the semiparametric mixture of two distributions equal
up to a shift parameter. The model is said to be semiparametric in the sense that the
mixed distribution is not supposed to belong to a parametric family. In order to ensure
the identifiability of the model it is assumed that the mixed distribution is symmetric,
the model being then defined by the mixing proportion, two location parameters, and
the probability density function of the mixed distribution. We propose a new class of
M-estimators of these parameters based on a Fourier approach, and prove that they
are $\sqrt{n}$-consistent under mild regularity conditions. Their finite-sample properties are
illustrated by a Monte Carlo study and a benchmark real dataset is also studied with
our method.
1 Introduction

The probability density functions (pdf) of d-variate multicomponent mixture models are
defined by

$$g(x) = \sum_{i=1}^{k} \lambda_i f_i(x), \quad x \in \mathbb{R}^d, \qquad (1)$$

where the unknown proportions $\lambda_i$ ($\lambda_i > 0$ and $\sum_{i=1}^{k} \lambda_i = 1$) and unknown pdfs $f_i$ are to

be estimated. Generally the $f_i$'s are supposed to belong to a parametric family of density
functions, turning the inference problem for model (1) into a purely parametric estimation
problem. There exists an extensive literature on this subject including the monographs of
Everitt and Hand (1981), Titterington et al. (1985) or McLachlan and Peel (2000), which
provide a good overview of the existing methods in this case such as maximum likelihood,
minimum chi-square, moments method, Bayesian approaches etc. Note that the estimation
of the number of components k in model (1) may also be a crucial issue leading to various
rates of convergence for maximum likelihood estimators, as discussed by Chen (1995). In
that case, model selection is an important topic, see for example Dacunha-Castelle
& Gassiat (1999), Lemdani & Pons (1999), and Leroux (1992). In addition the choice of
a parametric family for the $f_i$'s may be difficult when little information is known about
each subpopulation. However, model (1) is generally nonparametrically nonidentifiable
without additional assumptions. This is no longer true when training data are available
from each subpopulation; see for example Cerrito (1992), Hall (1981), Lancaster & Imbens
(1996), Murray & Titterington (1978), and Qin (1999). Hall and Zhou (2003) first
considered the case where no parametric assumptions are made about the $f_i$'s involved in
model (1). These authors looked at d-variate mixtures of two distributions, each having
independent components, and proved that, under mild regularity conditions, their model
is identifiable when $d \geq 3$. They propose in addition $\sqrt{n}$-consistent estimators of the 2d
univariate marginal cumulative distribution functions and the mixing proportion. Even
if model (1) is not nonparametrically identifiable, there exist for $d = 1$ and $k \geq 2$ many
real data sets in the statistical literature for which such a model is used under parametric
assumptions on the $f_i$'s, such as the Old Faithful dataset, see Azzalini & Bowman (1990),
which corresponds to time measurements (in minutes) between eruptions of the Old Faithful
geyser in Yellowstone National Park, USA. Another famous example deals with average
amounts of precipitation (rainfall) in inches for United States cities (from the Statistical
Abstract of the United States, 1975; see McNeil (1977)). These data sets are both included
in the R statistical package.

To model this type of data ($d = 1$ and $k \geq 2$) from a semiparametric point of view,
Bordes, Mottelet & Vandekerkhove (2006) (abbreviated BMV) and Hunter, Wang &
Hettmansperger (2007) (abbreviated HWH) both proposed to consider i.i.d. sample
data $(X_1, \ldots, X_n)$ drawn from a common pdf $g$ satisfying

$$g(x) = \sum_{i=1}^{k} \lambda_i f(x - \mu_i), \quad x \in \mathbb{R}, \qquad (2)$$

where $\mu_i \in \mathbb{R}$, $\lambda_i > 0$ for all $i \in \{1, \ldots, k\}$ such that $\sum_{i=1}^{k} \lambda_i = 1$ and $f$ is an unknown
pdf. When $f$ is supposed to be symmetric about zero, that is $f(x) = f(-x)$ for all $x \in \mathbb{R}$,
the above authors proposed M-estimation methods based on the cumulative distribution
function (cdf) in order to estimate separately the Euclidean and functional parts of model
(2). The crucial part of their work deals with the identifiability of model (2) under the
simple symmetry assumption on $f$. Their basic results are established in BMV, Theorem
2.1 and HWH, Theorems 1, 2 and Corollary 1. The mixed density $g$ in (2) can also be seen
as the density of i.i.d. observations $X_i$ in a convolution model:

$$X_i = Z_i + \varepsilon_i, \quad i = 1, \ldots, n, \qquad (3)$$

where the $Z_i$'s are i.i.d. with common pdf $f$ and independent of the i.i.d. errors $\varepsilon_i$'s with discrete
law such that $P(\varepsilon = \mu_i) = \lambda_i$, for $i = 1, \ldots, k$. Previous results mean that, if $k$ is
known and $f$ is supposed to be symmetric about 0, then we can identify the law of the
errors and estimate nonparametrically the pdf $f$. Let us notice that the mixture problem
in (2) and the deconvolution problem in (3) are the same. They are both an inverse
problem with unknown operator (i.e. convolution with an unknown law having support
on $k$ unknown points). In particular when $k = 2$, $\lambda_1 := p_0$ and $(\mu_1, \mu_2) := (\alpha_0, \beta_0)$,
according to Theorem 2.1 in BMV, such a model is identifiable if the Euclidean parameter
$\theta_0 := (p_0, \alpha_0, \beta_0) \in [0, 1/2) \times \mathbb{R}^2 \setminus \Delta$, where $\Delta = \{(x, x);\ x \in \mathbb{R}\}$, and the mixed density
$f$ is symmetric about 0. When $k = 2$, BMV prove, under mild conditions, that both
the Euclidean parameter and the cumulative distribution function of $f$ of model (2) are
estimated almost surely at the rate $n^{-1/4+\alpha}$, for all $\alpha > 0$ (see Theorems 3.3 and 3.4).
When $k = 2$ or 3, HWH prove, under mild conditions, the strong consistency of their
estimator, and establish, under very technical conditions, its asymptotic normality (see
Theorems 3 and 4 therein).
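The convolution representation (3) also gives a direct way to simulate from the two-component model. The following is a minimal sketch in Python with NumPy; the function name `sample_shift_mixture` and the `base_sampler` argument are illustrative choices and not part of the paper.

```python
import numpy as np

def sample_shift_mixture(n, p, alpha, beta, base_sampler, rng):
    """Draw X_i = Z_i + eps_i as in the convolution model (3) with k = 2.

    base_sampler(rng, n) must return n i.i.d. draws from the symmetric pdf f
    (hypothetical callable, supplied by the user).
    """
    z = base_sampler(rng, n)                           # Z_i i.i.d. with pdf f
    eps = np.where(rng.random(n) < p, alpha, beta)     # P(eps = alpha) = p
    return z + eps

rng = np.random.default_rng(0)
# Illustration: standard Gaussian f, theta_0 = (0.25, -1, 2), n = 100.
x = sample_shift_mixture(100, 0.25, -1.0, 2.0,
                         lambda r, m: r.standard_normal(m), rng)
```

Swapping the lambda for `r.standard_cauchy(m)` or `r.laplace(size=m)` gives samples with a Cauchy or Laplace mixed density under the same assumptions.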
In this paper we propose to investigate a new estimation method. Let us first recall
that BMV propose an iterative procedure to invert the operator and a contrast which is
based on the cdf $G$ and the symmetry of the underlying unknown pdf $f$. HWH introduce
a contrast based on the cdf $G$ of the observations and estimate the Euclidean parameter
using the symmetry property of the unknown pdf $f$. Here, we use Fourier analysis to
invert the operator and see that under identifiability assumptions the inverse problem is
well posed. Then we construct a contrast based on characteristic functions of our data
which allows us to estimate $\theta$ when $f$ is symmetric. This contrast is a functional of $g$ which
is estimated by a U-statistic of order 2 at parametric rate under a very mild smoothness
assumption on $f$ (Sobolev smoothness larger than 1/4). Our procedure is easier to deal
with and yields a central limit theorem for the estimator of $\theta$ under much simpler
conditions than those of Theorem 4 in HWH. Moreover, we define a kernel estimator of the
pdf $f$ and prove that it attains the same nonparametric rate as in the direct problem of
density estimation: the inverse problem does not affect the pointwise rate of convergence
of the density estimator. Our estimators and convergence results generalize to the mixture
model with $k \geq 3$ components, as soon as the model satisfies identifiability assumptions.
Such assumptions are known for $k = 3$ only, see Corollary 1 in HWH.

The paper is organized as follows: in Section 2 we propose a contrast function based
on a Fourier transform of the dataset pdf and derive our estimation method; in Section
3 we present our main asymptotic results, which concern the $\sqrt{n}$-rate of convergence for
the Euclidean part of the parameter, and show that the classical nonparametric rate of
convergence is achieved for our inverse Fourier nonparametric estimator; in Section 4 we
propose a Monte Carlo study of our estimators on several simulated examples and implement
our method on a real dataset which deals with the average amounts of precipitation
(rainfall) in inches for United States cities, see McNeil (1977); Section 5 is dedicated to
auxiliary results and proofs.

2 Estimation procedure 

We observe $X_1, \ldots, X_n$, independent, identically distributed random variables having
common pdf $g$ in the model

$$g(x) = p_0 f(x - \alpha_0) + (1 - p_0) f(x - \beta_0), \quad x \in \mathbb{R}, \qquad (4)$$

where $\theta_0 := (p_0, \alpha_0, \beta_0)$ denotes the unknown value of the Euclidean parameter and $f \in L_2$
is an unknown, symmetric pdf in a large nonparametric class of functions.

For identifiability reasons, let $\theta_0$ belong to a compact set $\Theta \subset (0, 1/2) \times \mathbb{R}^2 \setminus \Delta$.
Therefore, there are positive $P_*$, $P$, which are smaller than 1/2, such that $p_0 \in [P_*, P]$.

Note that in case $p_0 = 0$ we can still identify $\beta_0$ but not $\alpha_0$. As this case reduces to
the estimation of the location of an unknown symmetric pdf $f$ as in Beran (1978), we do
not consider this case further on.

From now on, we denote by $f^*(u) = \int_{\mathbb{R}} e^{iux} f(x)\,dx$ the Fourier transform and recall
that if $f^* \in L_1$ we have the inversion formula $f(x) = (2\pi)^{-1} \int_{\mathbb{R}} e^{-iux} f^*(u)\,du$.

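Let us denote $M(\theta, u) := p e^{iu\alpha} + (1 - p) e^{iu\beta}$, for all $\theta = (p, \alpha, \beta) \in \Theta$ and $u \in \mathbb{R}$, and see that it
cannot be 0 as soon as $p \neq 1/2$. It is enough to notice that $(1 - 2P)^2 \leq |M(\theta, u)|^2 \leq 1$ for
all $(u, \theta) \in \mathbb{R} \times \Theta$.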
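The two-sided bound on $|M(\theta, u)|^2$ follows from the identity $|M(\theta,u)|^2 = 2p^2 - 2p + 1 + 2p(1-p)\cos(u(\alpha - \beta))$, whose minimum over $u$ is $(1-2p)^2$. A small numerical check, given as a sketch with arbitrarily chosen parameter values (the values of `theta` and the grid are illustrative assumptions):

```python
import numpy as np

def M(theta, u):
    """M(theta, u) = p e^{i u alpha} + (1 - p) e^{i u beta}."""
    p, alpha, beta = theta
    return p * np.exp(1j * u * alpha) + (1 - p) * np.exp(1j * u * beta)

theta = (0.3, -1.0, 2.0)            # illustrative value with p < 1/2
p, alpha, beta = theta
u = np.linspace(-50.0, 50.0, 20001)

mod2 = np.abs(M(theta, u)) ** 2
# Closed form of |M|^2 obtained by expanding the squared modulus.
closed = 2 * p**2 - 2 * p + 1 + 2 * p * (1 - p) * np.cos(u * (alpha - beta))

lower = (1 - 2 * p) ** 2            # equals (1 - 2P)^2 when p = P
```

On this grid `mod2` stays between `lower` and 1 and agrees with `closed` up to rounding, so $1/M(\theta, u)$ is bounded by $1/(1-2P)$ on $\Theta$.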

The contrast uses the symmetry of the underlying, unknown pdf $f$. For the first time
in the literature of mixture models, we relate the symmetry of $f$ to the fact that its Fourier
transform has no imaginary part. More precisely, in model (4)

$$g^*(u) = \left(p_0 e^{iu\alpha_0} + (1 - p_0) e^{iu\beta_0}\right) f^*(u) = M(\theta_0, u) f^*(u), \quad u \in \mathbb{R}.$$

When $f$ is supposed to be symmetric about 0, we can hope that $\mathrm{Im}(g^*(u)/M(\theta, u)) = 0$,
for all $u \in \mathbb{R}$, if and only if $\theta = \theta_0$. This basic result is formally stated in the following
theorem.

Theorem 1 Consider model (4) with $f$ symmetric about 0 and $\theta_0 \in \Theta$. Then we have
$\mathrm{Im}(g^*/M(\theta, \cdot)) = 0$ for some $\theta \in \Theta$ if and only if $\theta = \theta_0$.

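Proof. Notice that for all $\theta \in \Theta$ such that $\mathrm{Im}(g^*/M(\theta, \cdot)) = 0$ we explicitly have

$$f^*(u)\, \mathrm{Im}\left(M(\theta_0, u)\, \overline{M(\theta, u)}\right) = 0$$

for all $u \in \mathbb{R}$. As $f^*(0) = 1$ and $f^*$ is continuous, we get that $\mathrm{Im}(M(\theta_0, \cdot)\overline{M(\theta, \cdot)})$ is null in a neighborhood of
0, which leads, following the proof of Theorem 2.1 in BMV, to the wanted result $\theta = \theta_0$.
■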
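Theorem 1 can be illustrated numerically when $g^*$ is known. The sketch below assumes a standard Gaussian $f$ (so $f^*(u) = e^{-u^2/2}$) and a standard Gaussian weight density; both are illustrative choices, and the helper name `contrast_S` is hypothetical. It integrates the squared imaginary part of $g^*/M(\theta,\cdot)$ at the true parameter and at a wrong one.

```python
import numpy as np

def M(theta, u):
    p, alpha, beta = theta
    return p * np.exp(1j * u * alpha) + (1 - p) * np.exp(1j * u * beta)

def contrast_S(theta, theta0, u, w):
    """Riemann-sum approximation of the integral of Im(g*/M(theta,.))^2 w(u) du."""
    f_star = np.exp(-0.5 * u**2)        # Fourier transform of the N(0,1) pdf (assumed f)
    g_star = M(theta0, u) * f_star      # g* = M(theta0, .) f*
    im = np.imag(g_star / M(theta, u))
    du = u[1] - u[0]
    return np.sum(im**2 * w) * du

u = np.linspace(-8.0, 8.0, 4001)
w = np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)   # assumed weight density
theta0 = (0.25, -1.0, 2.0)

s_true = contrast_S(theta0, theta0, u, w)            # numerically zero
s_wrong = contrast_S((0.25, 0.0, 2.0), theta0, u, w) # strictly positive
```

The first value vanishes up to rounding because $g^*/M(\theta_0,\cdot) = f^*$ is real, while the second is strictly positive, in line with the theorem.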

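Assuming $g^*$ known we can recover the true value of the Euclidean parameter by
minimizing the discrepancy measure $S$ defined by

$$S(\theta) := \int_{\mathbb{R}} \left(\mathrm{Im}\, \frac{g^*(u)}{M(\theta, u)}\right)^2 dW(u), \qquad (5)$$

where $W$ is a Lebesgue-absolutely continuous probability measure supported by $\mathbb{R}$.
Note that we can also write

$$S(\theta) = -\frac{1}{4} \int_{\mathbb{R}} \left(\frac{g^*(u)}{M(\theta, u)} - \frac{\overline{g^*(u)}}{\overline{M(\theta, u)}}\right)^2 dW(u).$$

From now on, $\bar{z}$ denotes the complex conjugate of $z$.

Proposition 1 The function $S$ in (5) is a contrast function, i.e. for all $\theta \in \Theta$, $S(\theta) \geq 0$
and $S(\theta) = 0$ if and only if $\theta = \theta_0$.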



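Proof. The Fourier transform $f^*$ being continuous, the same holds for $\mathrm{Im}\left(g^*(\cdot)/M(\theta, \cdot)\right)$. By
Theorem 1, if $\theta \neq \theta_0$ there exists $u_0 \in \mathbb{R}$ such that $\mathrm{Im}\left(g^*(u_0)/M(\theta, u_0)\right) \neq 0$, hence there exist
$\varepsilon > 0$ and $\gamma > 0$ such that $\left|\mathrm{Im}\left(g^*(u)/M(\theta, u)\right)\right| > \varepsilon$ on $[u_0 - \gamma, u_0 + \gamma]$. It follows that

$$S(\theta) \geq \varepsilon^2 \int_{u_0 - \gamma}^{u_0 + \gamma} dW(u) > 0.$$

Otherwise if $\theta = \theta_0$ it is straightforward to check that $S(\theta) = 0$. ■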

Discussion. We point out that basic results similar to Theorem 1 and Proposition 1 can
be established for model (2) when $k = 3$ under sufficient identifiability conditions. Indeed,
in that case, it is enough to replace $\theta$ by $(\lambda_1, \lambda_2, \mu_1, \mu_2, \mu_3)^T$ and $M(\theta, u)$ by $\sum_{j=1}^{3} \lambda_j e^{iu\mu_j}$
and check that the analog of Theorem 1 can be established following the proof of Lemma
A.1, under conditions provided in Corollary 1, in HWH. Finally, estimators similar to
those in Sections 2.1 and 2.2 and asymptotic results like those established in Section 3 for
$k = 2$ can be established with a little extra work for $k = 3$.

2.1 Contrast minimization for the Euclidean parameter 

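Let the estimator of $\theta_0$ be the following M-estimator

$$\hat\theta_n = \arg\min_{\theta \in \Theta} S_n(\theta), \qquad (6)$$

where $S_n(\theta)$, depending on some parameter $h > 0$ (small with $n$), is the following estimator
of $S(\theta)$

$$S_n(\theta) = -\frac{1}{4n(n-1)} \sum_{j \neq k} \int_{|u| \leq 1/h} \left(\frac{e^{iuX_k}}{M(\theta, u)} - \frac{e^{-iuX_k}}{M(\theta, -u)}\right) \left(\frac{e^{iuX_j}}{M(\theta, u)} - \frac{e^{-iuX_j}}{M(\theta, -u)}\right) dW(u). \qquad (7)$$

The estimator $S_n(\theta)$ is inspired by kernel estimators of quadratic functionals of the pdf $f$
as previously studied in Butucea (2007). It is written here in the Fourier domain. It is
known that by removing the diagonal terms in the double sum (i.e. taking $j \neq k$) the bias
is reduced with respect to the estimator where we plug an estimator of $g^*$ into $S(\theta)$.
Let us denote by

$$Z_k(\theta, u) := \frac{e^{iuX_k}}{M(\theta, u)} - \frac{e^{-iuX_k}}{M(\theta, -u)}, \qquad J(\theta, u) := \frac{g^*(u)}{M(\theta, u)} - \frac{g^*(-u)}{M(\theta, -u)}.$$

Then it is easy to see that

$$S_n(\theta) = -\frac{1}{4n(n-1)} \sum_{j \neq k} \int_{|u| \leq 1/h} Z_k(\theta, u) Z_j(\theta, u)\, dW(u), \qquad S(\theta) = -\frac{1}{4} \int J^2(\theta, u)\, dW(u),$$

and that $E[Z_k(\theta, u)] = J(\theta, u)$.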
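The empirical contrast can be evaluated without an explicit double loop. Since $e^{-iuX_k}/M(\theta,-u)$ is the complex conjugate of $e^{iuX_k}/M(\theta,u)$, one has $Z_k = 2i\,a_k$ with $a_k(u) = \mathrm{Im}(e^{iuX_k}/M(\theta,u))$, and $\sum_{j\neq k} a_j a_k = (\sum_k a_k)^2 - \sum_k a_k^2$. The sketch below uses this identity; the standard Gaussian weight density, the Riemann-sum quadrature and the name `empirical_contrast` are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def M(theta, u):
    p, alpha, beta = theta
    return p * np.exp(1j * u * alpha) + (1 - p) * np.exp(1j * u * beta)

def empirical_contrast(theta, x, h, n_grid=801):
    """U-statistic contrast S_n(theta), integrated over |u| <= 1/h.

    The weight W is taken to be the N(0,1) law (an arbitrary choice) and the
    integral is approximated by a Riemann sum on a regular grid.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    u = np.linspace(-1.0 / h, 1.0 / h, n_grid)
    du = u[1] - u[0]
    w = np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)
    # a[k, m] = Im(exp(i u_m X_k) / M(theta, u_m)), so that Z_k = 2i * a_k
    a = np.imag(np.exp(1j * np.outer(x, u)) / M(theta, u))
    # Off-diagonal double sum: sum_{j != k} a_j a_k = (sum a)^2 - sum a^2
    off_diag = a.sum(axis=0) ** 2 - (a**2).sum(axis=0)
    # -(1/4) * (2i)^2 = 1, hence the positive sign below.
    return np.sum(off_diag * w) * du / (n * (n - 1))
```

An estimate of $\theta_0$ is then obtained by minimizing `empirical_contrast` over a compact parameter set with any generic optimizer; the choice of optimizer and starting point is left open here.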

2.2 Kernel based nonparametric estimator

After estimating the Euclidean parameter, we want to estimate the nonparametric function
$f$. We suggest using a cross-validation-type kernel estimator as follows. We denote by $\hat\theta_{n,-k}$
the leave-one-out estimator of $\theta_0$, which uses the sample without the $k$-th observation.
Then we plug this into the classical nonparametric kernel estimator, whenever the unknown
$\theta_0$ is required. This procedure gives, in Fourier domain,

$$\hat f_n^*(u) = \frac{1}{n} \sum_{k=1}^{n} \frac{K^*(b_n u)\, e^{iuX_k}}{M(\hat\theta_{n,-k}, u)}, \qquad (8)$$

where $K$, the kernel ($\int K = 1$ and $K \in L_2$), and $b_n$, the bandwidth, are properly chosen.
Note that $G_n^*(u) := K^*(b_n u)/M(\hat\theta_{n,-k}, u)$ is in $L_1$ and $L_2$ and has an inverse Fourier
transform which we denote by $G_n(u/b_n)/b_n$. Therefore, the estimator of $f$ is

$$\hat f_n(x) = \frac{1}{n b_n} \sum_{k=1}^{n} G_n\left(\frac{x - X_k}{b_n}\right). \qquad (9)$$

It is important to notice at this step that the estimator $\hat f_n$ is obtained by inversion of
a nonparametric kernel estimator

$$\hat g_n(x) = \frac{1}{n b_n} \sum_{k=1}^{n} K\left(\frac{x - X_k}{b_n}\right), \qquad (10)$$

with kernel $K$ and bandwidth $b_n$. The inversion is done in Fourier domain with the
estimated $\hat\theta_{n,-k}$ instead of the true $\theta_0$.

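When dealing with the rainfall dataset studied in Section 4, we propose to consider, as in
BMV, the version $\tilde f_n$ of the estimator $\hat f_n$ (which has a negative part due to the small
number of observations) defined by

$$\tilde f_n(\cdot) = \frac{\hat f_n(\cdot)\, \mathbb{1}_{\hat f_n(\cdot) \geq 0}}{\int_{\mathbb{R}} \hat f_n(x)\, \mathbb{1}_{\hat f_n(x) \geq 0}\, dx}. \qquad (11)$$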
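The regularization in (11) amounts to truncating the negative part of the estimate and renormalizing the remainder to integrate to one. A minimal sketch on a regular grid follows; the function name `positive_part_density` and the toy input `f_raw` are illustrative assumptions, and the integral is approximated by a Riemann sum.

```python
import numpy as np

def positive_part_density(x_grid, f_values):
    """Truncate negative values and renormalize to unit mass on a regular grid.

    Returns the regularized values and the multiplicative factor applied to
    the positive part (the analogue of the constant in front of the indicator).
    """
    pos = np.clip(f_values, 0.0, None)
    dx = x_grid[1] - x_grid[0]
    mass = np.sum(pos) * dx          # Riemann-sum approximation of the integral
    return pos / mass, 1.0 / mass

# Toy input: a Gaussian pdf minus a small bump, which creates a negative lobe.
x_grid = np.linspace(-6.0, 8.0, 2801)
phi = lambda t: np.exp(-0.5 * t**2) / np.sqrt(2 * np.pi)
f_raw = phi(x_grid) - 0.05 * phi(x_grid - 3.0)

f_tilde, factor = positive_part_density(x_grid, f_raw)
```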

3 Main results 

Let us state first several assumptions. 

Assumption A Let $W : \mathbb{R} \to \mathbb{R}^+$ be a cumulative distribution function of some random
variable which admits finite absolute moments up to the third order:

$$\int \left(1 + |u| + u^2 + |u|^3\right) dW(u) < \infty.$$

Assumption B We assume that the underlying probability density $f$ belongs to a ball of
radius $L > 0$ in the Sobolev space of functions having smoothness $\beta > 0$:

$$\mathcal{W}(\beta, L) = \left\{ f : \mathbb{R} \to \mathbb{R}^+ :\ \int f = 1,\ \int |f^*(u)|^2 |u|^{2\beta}\, du \leq L \right\},$$

where $f^*$ denotes the Fourier transform of the function $f$.

The weight function $W$ has been introduced for integrability of our estimator $S_n(\theta)$
of the criterion $S(\theta)$ and its derivatives with respect to $\theta$. It is completely arbitrary and
it may help compute numerically the values of our integrals by Monte-Carlo simulation,
but it slightly reduces the asymptotic efficiency of $\hat\theta_n$. We could have used integrals with
respect to the Lebesgue measure for highest efficiency of $\hat\theta_n$, but this would require stronger
assumptions of smoothness and moments for the unknown probability density function $f$.

Proposition 2 For each $\theta \in \Theta$, the empirical contrast function $S_n(\cdot)$ defined in (7) with
$h \to 0$ when $n \to \infty$, is such that

$$\sup_{f \in \mathcal{W}(\beta, L)}\ \sup_{\theta \in \Theta} E\left[(S_n(\theta) - S(\theta))^2\right] \leq C \left( \frac{h^{4\beta}}{(1 - 2P)^4} + \frac{1}{(1 - 2P)^2 n} + \frac{1}{(1 - 2P)^4 n^2} \right),$$

for some constant $C > 0$, as $n \to \infty$.

An easy consequence of this proposition is that $|S_n(\theta) - E(S_n(\theta))| = O_P(n^{-1/2})$ as $n \to \infty$.

Moreover, if we choose $h = o(1) n^{-1/(4\beta)}$ the squared bias of $S_n(\theta)$ is infinitely smaller
when compared to its variance. So the mean squared error converges at rate $n^{-1}$ as soon
as $\beta > 1/4$.

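Theorem 2 The estimator $\hat\theta_n$ defined in (6) converges in probability to the true value of
the Euclidean parameter $\theta_0$ as $n \to \infty$.

Theorem 3 The estimator $\hat\theta_n$ defined in (6) with $h \to 0$ such that $h = o(1) n^{-1/(4\beta)}$ is
asymptotically normally distributed:

$$\sqrt{n}(\hat\theta_n - \theta_0) \to N(0, \Sigma) \text{ in distribution, as } n \to \infty,$$

where $\Sigma = \mathcal{I}^{-1} V \mathcal{I}^{-1}$, $\mathcal{I} = \mathcal{I}(\theta_0) = -\frac{1}{2} \int \dot J(\theta_0, u) \dot J(\theta_0, u)^T dW(u)$, $V = \frac{1}{4} E\left(U_1(\theta_0) U_1(\theta_0)^T\right)$
and $U_1(\theta_0) = \int Z_1(\theta_0, u) \dot J(\theta_0, u)\, dW(u)$.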

The next theorem gives the upper bounds for the rate of convergence of the nonparametric
estimator of $f$, at some fixed point $x$, over Sobolev classes of functions. The
main message of the theorem is that, if $\beta > 1/2$, then the nonparametric rates for density
estimation are reached, provided a correct choice of parameters $h$ and $b_n$. This might
seem surprising, but it is again related to the fact that the inverse problem under consideration
is well posed and the estimation of the Euclidean parameter $\theta_0$ does not affect the
nonparametric rate for estimating $f$.

Theorem 4 Let the estimator $\hat\theta_n$ of $\theta_0$ be defined in (6) and $\hat f_n(x)$ the estimator of $f(x)$
at some fixed point $x \in \mathbb{R}$ in (9), with $h = o(1) n^{-1/(4\beta)}$, $b_n = c\, n^{-1/(2\beta)}$ for some
$c > 0$ and a kernel $K$ in $L_1$ and in $L_2$ with Fourier transform $K^*$ having support included
in $\{u : |u| \leq 1\}$.
If $\beta > 1/2$,

$$\limsup_{n \to \infty}\ \sup_{\theta_0 \in \Theta}\ \sup_{f \in \mathcal{W}(\beta, L)} E_{\theta_0, f}\left[ n^{\frac{2\beta - 1}{2\beta}} |\hat f_n(x) - f(x)|^2 \right] \leq C,$$

for some constant $C < \infty$ which depends on $\beta$, $L$, $P$ and on $f$.
We can choose an arbitrary point $\theta \in \Theta$ and write

$$\sup_{f \in \mathcal{W}(\beta, L)}\ \sup_{\theta_0 \in \Theta} E_{\theta_0, f}\left[ n^{\frac{2\beta - 1}{2\beta}} |\hat f_n(x) - f(x)|^2 \right] \geq \sup_{f \in \mathcal{W}(\beta, L)} E_{\theta, f}\left[ n^{\frac{2\beta - 1}{2\beta}} |\hat f_n(x) - f(x)|^2 \right].$$

The lower bounds are known in the case of density estimation from direct observations,
see for example results for more general Besov classes of functions in Härdle et al. (1998).
They generalize easily to our case, with fixed $\theta$.

4 Simulations 

We implement our method and study its behaviour on samples of size $n = 100$. The mean
behaviour of our estimator $\hat\theta_n$ of $\theta_0$ is calculated by replicating $M = 100$ times the same
experiment. We considered that the underlying symmetric density is either Gaussian,
Cauchy or Laplace. We give the mean value of the estimated parameter and its standard
deviation in Tables 1, 3 and 4, respectively. We also plot the nonparametric estimator of
the underlying density as compared to the true one, in Figure 1.

We see that the smaller $p$ is, the smaller the standard deviation of $\hat\beta_n$ is. This is indeed
intuitively clear, as $1 - p$, which is then larger, represents the fraction of data sampled from the
second population, that is, the amount of information about the population which is located
at $\beta$.

We note that the previous estimation methods based on the distribution function
usually require finite moments up to some order. These methods cannot deal with the
Cauchy density that we consider here, see Table 3. Indeed, our method is based on the
Fourier transform, which is fast decreasing in this case. We also consider the non smooth
Laplace density (or double exponential), see Table 4. Its Fourier transform is slowly
decreasing, but we chose the weight function $w(x) = e^{-|x|}$ in order to deal with this
problem. Therefore, all integrals have relatively small support of integration and the
computation is fast enough.

In Table 2 we propose to illustrate the sensitivity of our method with respect to the
symmetry assumption by considering a symmetric case against various shapeless mixed
distributions close to the symmetric case.

Comments on Tables 1-4. Comparing rows 3 and 5 of Table 1 with rows 2 and
5 of Table 2 in BMV, it appears that our estimator is clearly less unstable than the
estimator proposed by these authors when $f$ is the $\mathcal{N}(0, 1)$ pdf. Table 2 summarizes
the performance of our method in a slightly shapeless situation where $f$ is the pdf of the
$\lambda \mathcal{N}(0.5, \sqrt{2}) + (1 - \lambda) \mathcal{N}(-0.5\lambda/(1 - \lambda), \sqrt{2})$ distribution satisfying $\int_{\mathbb{R}} x f(x)\,dx = 0$ and
$\int_{\mathbb{R}} x^2 f(x)\,dx = 1$, for all $\lambda \in (0, 1)$. When $\lambda = 0.5$ ($f$ is a symmetric bimodal pdf with
n      $(p_0, \alpha_0, \beta_0)$    Empirical means               Standard deviations
100    (0.05, -1, 2)      (0.0808, -1.0398, 2.0181)     (0.0477, 0.3038, 0.1354)
100    (0.10, -1, 2)      (0.1205, -1.0433, 1.9990)     (0.0478, 0.2829, 0.1569)
100    (0.15, -1, 2)      (0.1609, -0.9874, 2.0093)     (0.0406, 0.2964, 0.1455)
100    (0.25, -1, 2)      (0.2389, -0.9848, 1.9458)     (0.0407, 0.2936, 0.2059)
100    (0.35, -1, 2)      (0.3338, -1.0049, 1.9278)     (0.0439, 0.3151, 0.2200)
100    (0.45, -1, 2)      (0.4194, -0.9836, 1.9683)     (0.0362, 0.2996, 0.2727)

Table 1: Empirical means and standard deviations (from M = 100 samples of size n) of
the estimator $\hat\theta_n = (\hat p_n, \hat\alpha_n, \hat\beta_n)$ of $\theta_0 = (p_0, \alpha_0, \beta_0)$ when $f$ is standard Gaussian.



n      $\lambda$      Empirical means               Standard deviations
100    0.5    (0.2302, -1.0153, 1.9420)     (0.0390, 0.2949, 0.2627)
100    0.55   (0.2299, -1.0206, 1.9639)     (0.0418, 0.3319, 0.2693)
100    0.6    (0.2330, -0.9703, 1.9637)     (0.0402, 0.3134, 0.2808)
100    0.65   (0.2289, -0.9938, 2.0434)     (0.0399, 0.2572, 0.2744)

Table 2: Empirical means and standard deviations (from M = 100 samples of size n) of the
estimator $\hat\theta_n = (\hat p_n, \hat\alpha_n, \hat\beta_n)$ of $\theta_0 = (0.25, -1, 2)$ when $f$ is the pdf of a mixture distribution
$\lambda \mathcal{N}(0.5, \sqrt{2}) + (1 - \lambda) \mathcal{N}(-0.5\lambda/(1 - \lambda), \sqrt{2})$, obtained by considering $\lambda = 0.5, 0.55, 0.6, 0.65$.



n      $(p_0, \alpha_0, \beta_0)$    Empirical means               Standard deviations
100    (0.2, 1, 5)        (0.1987, 0.9888, 5.0116)      (0.0620, 0.3127, 0.2199)
100    (0.2, 1, 2)        (0.1915, 1.1103, 1.9728)      (0.0580, 0.2374, 0.2630)
100    (0.2, 1, 1.5)      (0.2068, 1.0815, 1.5358)      (0.0588, 0.2267, 0.2219)
100    (0.2, 1, 1.2)      (0.2092, 1.0890, 1.1871)      (0.0626, 0.2398, 0.2452)

Table 3: Empirical means and standard deviations (from M = 100 samples of size n) of
the estimator $\hat\theta_n = (\hat p_n, \hat\alpha_n, \hat\beta_n)$ of $\theta_0 = (p_0, \alpha_0, \beta_0)$ when $f$ is standard Cauchy.
n      $(p_0, \alpha_0, \beta_0)$    Empirical means               Standard deviations
100    (0.05, -1, 2)      (0.0520, -0.9768, 2.0034)     (0.0280, 0.4276, 0.1704)
100    (0.15, -1, 2)      (0.1518, -0.9765, 1.9769)     (0.0317, 0.4109, 0.1802)
100    (0.25, -1, 2)      (0.2447, -1.0103, 1.9886)     (0.0290, 0.4423, 0.2056)
100    (0.35, -1, 2)      (0.3432, -0.9602, 1.9407)     (0.0297, 0.4014, 0.2344)
100    (0.45, -1, 2)      (0.4300, -0.9710, 1.9547)     (0.0315, 0.4114, 0.3158)

Table 4: Empirical means and standard deviations (from M = 100 samples of size n) of
the estimator $\hat\theta_n = (\hat p_n, \hat\alpha_n, \hat\beta_n)$ of $\theta_0 = (p_0, \alpha_0, \beta_0)$ when $f$ is Laplace.




Figure 1: Underlying density (solid line) and kernel estimator (dashed line) for a) Gauss 
density, b) Cauchy density and c) Laplace density. 
mean and variance equal to 1) it is then interesting to compare the performance of our
method, see row 1 of Table 2, with its performance in the similar Gaussian case, see row
4 in Table 1, the noticeable fact being that the variance of $\hat\beta_n$ is smaller in the Gaussian
case. When $\lambda = 0.55, 0.6, 0.65$ the bias of $\hat p_n$ is badly affected while the standard deviations
of the estimators remain stable. The results provided in Table 3 seem to show that the
heavy tails of the Cauchy distribution have essentially a bad influence on the standard
deviation of $\hat p_n$. Comparing Table 1 and Table 4 it appears that the peak on the graph of
the Laplace pdf helps to estimate the parameter $p_0$ but does not work in favor of the other
parameters.



Rainfall dataset. In this paragraph we propose to study the performances of our method 
when compared to the results obtained in BMV. We have implemented the Gauss kernel 
estimator with bandwidth 5„ = 2n~^/^, n = 70, and used in ([s]), instead of On. _k, the 
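estimator $\hat\theta_n$. When $K$ is the Gauss kernel, we explicitly have

$$\hat f_n(x) = \frac{1}{n} \sum_{k=1}^{n} \int Q(b_n, \hat\theta_n; u) \left[ \hat p_n \cos(u(X_k - x - \hat\alpha_n)) + (1 - \hat p_n) \cos(u(X_k - x - \hat\beta_n)) \right] du,$$

where

$$Q(b, \theta; u) := \frac{1}{2\pi} \times \frac{e^{-b^2 u^2/2}}{2p^2 - 2p + 1 + 2p(1 - p)\cos(u(\alpha - \beta))}.$$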
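The explicit Gauss-kernel formula can be evaluated by numerical integration over $u$. The sketch below is one possible implementation under stated assumptions: the sum over $k$ is carried by the empirical characteristic function $\varphi_n(u) = n^{-1}\sum_k e^{iuX_k}$, using $n^{-1}\sum_k \cos(u(X_k - c)) = \mathrm{Re}(\varphi_n(u) e^{-iuc})$; the integral is truncated to $|u| \leq 8/b$ and approximated by a Riemann sum. The name `f_hat_gauss` and the grid sizes are illustrative.

```python
import numpy as np

def f_hat_gauss(x_grid, data, theta, b, n_u=2001):
    """Evaluate the Gauss-kernel estimator of f on x_grid for a plug-in theta."""
    p, alpha, beta = theta
    u = np.linspace(-8.0 / b, 8.0 / b, n_u)   # exp(-b^2 u^2 / 2) is negligible beyond
    du = u[1] - u[0]
    # Q(b, theta; u) from the displayed formula.
    denom = 2 * p**2 - 2 * p + 1 + 2 * p * (1 - p) * np.cos(u * (alpha - beta))
    q = np.exp(-0.5 * (b * u) ** 2) / (2 * np.pi * denom)
    # Empirical characteristic function of the sample.
    phi = np.exp(1j * np.outer(u, data)).mean(axis=1)
    out = np.empty(len(x_grid))
    for i, x in enumerate(x_grid):
        c = (p * np.real(phi * np.exp(-1j * u * (x + alpha)))
             + (1 - p) * np.real(phi * np.exp(-1j * u * (x + beta))))
        out[i] = np.sum(q * c) * du
    return out
```

With the same plug-in value of $\theta$, the reconstruction $p\hat f_n(\cdot - \alpha) + (1-p)\hat f_n(\cdot - \beta)$ reproduces the ordinary Gauss-kernel density estimate of $g$ up to quadrature error, which gives a convenient sanity check of an implementation.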

The results provided by our method are $\hat p_n = 0.15$, $\hat\alpha_n = 12.7$, $\hat\beta_n = 38.5$ and the
behavior of the functional estimators is summarized in Figure 3. Before commenting on the
good performances of our estimator $(\hat\theta_n, \tilde f_n)$ in Figure 3, it is crucial to notice that the
reconstruction of the pdf $g$ by $g_{\hat\theta_n, \hat f_n}(\cdot) = \hat p_n \hat f_n(\cdot - \hat\alpha_n) + (1 - \hat p_n) \hat f_n(\cdot - \hat\beta_n)$ coincides with $\hat g_n$
itself, according to (8) and (10) and replacing $\hat\theta_{n,-k}$ by $\hat\theta_n$. This basic phenomenon is illustrated
in Figure 2. As mentioned in Section 2.2 the function $\hat f_n$ is not necessarily a pdf due to its
negative part (coming from the small size of $n$ and the fact that model (4) is not necessarily
the true underlying model), hence it is needed to regularize $\hat f_n$ into $\tilde f_n$, which leads to
consider, on this real dataset, $\tilde f_n = 0.9644 \times \hat f_n \mathbb{1}_{\hat f_n > 0}$. This modification explains the fact
that the graph of $g_{\hat\theta_n, \tilde f_n} = \hat p_n \tilde f_n(\cdot - \hat\alpha_n) + (1 - \hat p_n) \tilde f_n(\cdot - \hat\beta_n)$ does not match exactly the graph
of $g_{\hat\theta_n, \hat f_n} = \hat g_n$. Actually we observe that the graph of $g_{\hat\theta_n, \tilde f_n}(\cdot)$ fits almost perfectly the
graph of $\hat g_n$ in the interval [0, 80], while it generates an extra bump in the interval [-20, 0].
Nevertheless when comparing our graphs to the graphs obtained in BMV (including a
comparison with the two-component Gaussian mixture model), we observe that we both
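Figure 2: Rainfall dataset. In blue the graph of $\hat p_n \hat f_n(\cdot - \hat\alpha_n)$, in red the graph of
$(1 - \hat p_n) \hat f_n(\cdot - \hat\beta_n)$, in green the graph of $g_{\hat\theta_n, \hat f_n}(\cdot) = \hat p_n \hat f_n(\cdot - \hat\alpha_n) + (1 - \hat p_n) \hat f_n(\cdot - \hat\beta_n) = \hat g_n$
obtained with $h_n = 2.5$.

Figure 3: Rainfall dataset. a) Graph of $\tilde f_n$; b) In blue the graph of $\hat p_n \tilde f_n(\cdot - \hat\alpha_n)$, in red the
graph of $(1 - \hat p_n) \tilde f_n(\cdot - \hat\beta_n)$, in black the graph of $g_{\hat\theta_n, \tilde f_n}(\cdot) = \hat p_n \tilde f_n(\cdot - \hat\alpha_n) + (1 - \hat p_n) \tilde f_n(\cdot - \hat\beta_n)$,
in green the graph of $\hat g_n$ obtained with $h_n = 2.5$.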
have the extra bump issue on the interval [-20, 0]; on the other hand we better estimate
the first two bumps appearing on the graph of $\hat g_n$ within the interval [0, 20]. We think
that our methodological approach performs better than the existing one, mainly because
we do not symmetrize our functional estimator in order to mimic as much as possible
the shape of $\hat f_n$ (whose shapelessness is precisely the reason why $g_{\hat\theta_n, \hat f_n} = \hat g_n$, see Figure 2).

5 Auxiliary results and Proofs 

Let us use the notation $\|v\|$ for the Euclidean norm of a vector $v \in \mathbb{R}^d$ and $\|A\|_2^2 = \mathrm{tr}(A^T A)$
for any matrix $A$ in $\mathbb{R}^{d \times d}$.

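Lemma 1 1. For all $u \in \mathbb{R}$, we have

$$\max\left\{ \sup_{\theta \in \Theta} |Z_k(\theta, u)|,\ \sup_{\theta \in \Theta} |J(\theta, u)| \right\} \leq \frac{2}{1 - 2P},$$

for any $k$ from 1 to $n$.
2. For all $u \in \mathbb{R}$, we have

$$\max\left\{ \sup_{\theta \in \Theta} \|\dot Z_k(\theta, u)\|,\ \sup_{\theta \in \Theta} \|\dot J(\theta, u)\| \right\} \leq \frac{4(1 + |u|)}{(1 - 2P)^2},$$

for any $k$ from 1 to $n$.
3. For all $u \in \mathbb{R}$, we have

$$\|\ddot Z_k(\theta, u)\|_2 \leq \frac{C(1 + |u| + u^2)}{(1 - 2P)^3},$$

for some absolute constant $C > 0$, for any $\theta \in \Theta$ and for any $k$ from 1 to $n$.

Proof. 1. It is easy to see that $|Z_j(\theta, u)| \leq 2/|M(\theta, u)| \leq 2/(1 - 2P)$ and that

$$|J(\theta, u)| \leq 2 \left| \frac{g^*(u)}{M(\theta, u)} \right| \leq \frac{2}{1 - 2P}.$$

2. Let $\dot M(\theta, u) = \left(e^{iu\alpha} - e^{iu\beta},\ iup\, e^{iu\alpha},\ iu(1 - p) e^{iu\beta}\right)^T$ denote the gradient of $M$ with respect to $\theta = (p, \alpha, \beta)$. We note that

$$\dot Z_k(\theta, u) = -\frac{e^{iuX_k}}{M^2(\theta, u)} \begin{pmatrix} e^{iu\alpha} - e^{iu\beta} \\ iup\, e^{iu\alpha} \\ iu(1 - p) e^{iu\beta} \end{pmatrix} + \frac{e^{-iuX_k}}{M^2(\theta, -u)} \begin{pmatrix} e^{-iu\alpha} - e^{-iu\beta} \\ -iup\, e^{-iu\alpha} \\ -iu(1 - p) e^{-iu\beta} \end{pmatrix}$$

and that $E[\dot Z_k(\theta, u)] = \dot J(\theta, u)$. We have

$$\|\dot J(\theta, u)\| = \left\| -\frac{g^*(u)}{M^2(\theta, u)} \dot M(\theta, u) + \frac{g^*(-u)}{M^2(\theta, -u)} \dot M(\theta, -u) \right\| \leq \frac{2\left(2^2 + p^2 u^2 + (1 - p)^2 u^2\right)^{1/2}}{(1 - 2P)^2} \leq \frac{4(1 + |u|)}{(1 - 2P)^2},$$

and the same goes for $\dot Z_k(\theta, u)$.
3. We write briefly

$$\ddot Z_k(\theta, u) = -\frac{e^{iuX_k}}{M^2(\theta, u)} \ddot M(\theta, u) + \frac{e^{-iuX_k}}{M^2(\theta, -u)} \ddot M(\theta, -u) + 2\frac{e^{iuX_k}}{M^3(\theta, u)} \dot M(\theta, u) \dot M(\theta, u)^T - 2\frac{e^{-iuX_k}}{M^3(\theta, -u)} \dot M(\theta, -u) \dot M(\theta, -u)^T.$$

We deduce our bound from above. ■

Lemma 2 1. For all $u \in \mathbb{R}$, we have

$$\|\dot Z_k(\theta, u) - \dot Z_k(\theta', u)\| \leq \|\theta - \theta'\|\, \frac{C(1 + |u| + u^2)}{(1 - 2P)^3},$$

for any $\theta, \theta' \in \Theta$ and any $k$ from 1 to $n$.
2. For all $u \in \mathbb{R}$, we have

$$\|\ddot Z_k(\theta, u) - \ddot Z_k(\theta', u)\|_2 \leq \|\theta - \theta'\|\, \frac{C(1 + |u| + u^2 + |u|^3)}{(1 - 2P)^4},$$

for some absolute constant $C > 0$, for any $\theta, \theta' \in \Theta$ and for any $k$ from 1 to $n$.

Proof. The proof uses a Taylor expansion and bounds from and similar to Lemma 1.
■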

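Proof of Proposition 2. It is easy to see that $E[Z_k(\theta, u)] = J(\theta, u)$. Therefore the
estimation bias is

$$|E(S_n(\theta)) - S(\theta)| = \frac{1}{4} \left| \int_{|u| > 1/h} \left( \frac{g^*(u)}{M(\theta, u)} - \frac{\overline{g^*(u)}}{\overline{M(\theta, u)}} \right)^2 dW(u) \right| \leq \int_{|u| > 1/h} \left| \mathrm{Im}\, \frac{g^*(u)}{M(\theta, u)} \right|^2 dW(u) \leq \frac{1}{(1 - 2P)^2} \int_{|u| > 1/h} |g^*(u)|^2\, dW(u).$$

If we assume $f \in \mathcal{W}(\beta, L)$, for some $\beta > 0$ and $L > 0$, then

$$|E(S_n(\theta)) - S(\theta)| \leq \frac{C h^{2\beta}}{(1 - 2P)^2}, \qquad (12)$$

for some constant $C > 0$. We have for the variance

$$\mathrm{Var}(S_n(\theta)) = \frac{1}{16} E\left[ \left( \frac{1}{n(n-1)} \sum_{j \neq k} \int_{|u| \leq 1/h} \left( Z_j(\theta, u) Z_k(\theta, u) - J^2(\theta, u) \right) dW(u) \right)^2 \right].$$

It decomposes in $\mathrm{Var}(S_n(\theta)) = \frac{1}{16}(T_n + V_n)$, where

$$T_n = E\left[ \left( \frac{1}{n(n-1)} \sum_{j \neq k} \int_{|u| \leq 1/h} (Z_j(\theta, u) - J(\theta, u))(Z_k(\theta, u) - J(\theta, u))\, dW(u) \right)^2 \right],$$

$$V_n = E\left[ \left( \frac{2}{n} \sum_{k=1}^{n} \int_{|u| \leq 1/h} (Z_k(\theta, u) - J(\theta, u)) J(\theta, u)\, dW(u) \right)^2 \right].$$

Indeed, random variables in the previous sums are uncorrelated. Let us study the asymptotic
behavior of these terms. On the one hand,

$$T_n = \frac{1}{n(n-1)} E\left[ \left( \int_{|u| \leq 1/h} (Z_1(\theta, u) - J(\theta, u))(Z_2(\theta, u) - J(\theta, u))\, dW(u) \right)^2 \right] \leq \frac{1}{n(n-1)} E\left[ \left| \int_{|u| \leq 1/h} Z_1(\theta, u) Z_2(\theta, u)\, dW(u) \right|^2 \right] \leq \frac{16}{(1 - 2P)^4 n^2},$$

since from Lemma 1 we have $|Z_k(\theta, u)| \leq 2(1 - 2P)^{-1}$. In addition,

$$V_n = \frac{4}{n} E\left[ \left( \int_{|u| \leq 1/h} Z_1(\theta, u) J(\theta, u)\, dW(u) \right)^2 \right] - \frac{4}{n} \left( \int_{|u| \leq 1/h} J^2(\theta, u)\, dW(u) \right)^2.$$

It is obvious that $\int_{|u| \leq 1/h} J^2(\theta, u)\, dW(u) \to -4S(\theta)$ as $h \to 0$. As for the first term, we
use that $|J(\theta, u)| \leq 2(1 - 2P)^{-1}$ for all $u \in \mathbb{R}$ and $\theta \in \Theta$ and we write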

E I / Zi{e,u)J{9,u)dWiu)] < 



\u\<l/h 



[1 - 2Pf 
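Lemma 3 i) The function $S$ is Lipschitz over $\Theta$.
ii) The empirical contrast $S_n$ defined in (7) is Lipschitz over $\Theta$.
iii) The empirical contrast $S_n$ defined in (7) is such that $\ddot S_n$ is Lipschitz over $\Theta$.

Proof. i) According to the mean value theorem, we write

$$S(\theta) - S(\theta') = -\frac{1}{4} \int \left[ J^2(\theta, u) - J^2(\theta', u) \right] dW(u) = -\frac{1}{2} \int (\theta - \theta')^T \cdot \dot J(\theta_u, u) J(\theta_u, u)\, dW(u),$$

where for all $u \in \mathbb{R}$, $\theta_u$ lies in the line segment with extremities $\theta$ and $\theta'$. By the Cauchy-Schwarz
inequality,

$$|S(\theta) - S(\theta')| \leq \frac{1}{2} \|\theta - \theta'\| \cdot \int \|\dot J(\theta_u, u)\| \cdot |J(\theta_u, u)|\, dW(u).$$

By Lemma 1, $|S(\theta) - S(\theta')| \leq 4(1 - 2P)^{-3} \int (1 + |u|)\, dW(u) \cdot \|\theta - \theta'\|$.
ii) Very similarly,

$$S_n(\theta) - S_n(\theta') = -\frac{1}{4n(n-1)} \sum_{j \neq k} \int_{|u| \leq 1/h} (\theta - \theta')^T \cdot \nabla\left( Z_k(\theta, u) Z_j(\theta, u) \right)\Big|_{\theta = \theta_u} dW(u) = -\frac{1}{2n(n-1)} \sum_{j \neq k} \int_{|u| \leq 1/h} (\theta - \theta')^T \cdot \dot Z_k(\theta_u, u) Z_j(\theta_u, u)\, dW(u),$$

where for all $u \in \mathbb{R}$, $\theta_u$ lies in the line segment with extremities $\theta$ and $\theta'$. Therefore

$$|S_n(\theta) - S_n(\theta')| \leq \frac{4}{(1 - 2P)^3} \|\theta - \theta'\| \cdot \int (1 + |u|)\, dW(u).$$

Indeed, by Lemma 1, $Z_j$ and $\dot Z_k$ have the same upper bounds as $J$ and $\dot J$, respectively.
iii) We have

$$\ddot S_n(\theta) = -\frac{1}{2n(n-1)} \sum_{j \neq k} \int_{|u| \leq 1/h} \left[ \ddot Z_k(\theta, u) Z_j(\theta, u) + \dot Z_k(\theta, u) \dot Z_j(\theta, u)^T \right] dW(u).$$

We shall bound from above as follows

$$\|\ddot S_n(\theta) - \ddot S_n(\theta')\|_2 \leq \frac{1}{2n(n-1)} \sum_{j \neq k} \Bigg( \left\| \int_{|u| \leq 1/h} (\ddot Z_k(\theta, u) - \ddot Z_k(\theta', u)) Z_j(\theta, u)\, dW(u) \right\|_2 + \left\| \int_{|u| \leq 1/h} \ddot Z_k(\theta', u)(Z_j(\theta, u) - Z_j(\theta', u))\, dW(u) \right\|_2$$
$$\qquad + \left\| \int_{|u| \leq 1/h} \dot Z_k(\theta, u)(\dot Z_j(\theta, u) - \dot Z_j(\theta', u))^T dW(u) \right\|_2 + \left\| \int_{|u| \leq 1/h} (\dot Z_k(\theta, u) - \dot Z_k(\theta', u)) \dot Z_j(\theta', u)^T dW(u) \right\|_2 \Bigg).$$

For each term in the previous sum, we use Taylor expansion and Lemmas 1 and 2 to get

$$\|\ddot S_n(\theta) - \ddot S_n(\theta')\|_2 \leq \|\theta - \theta'\|\, \frac{C \int (1 + |u| + u^2 + |u|^3)\, dW(u)}{(1 - 2P)^5},$$

for some constant $C > 0$, which finishes the proof by our Assumption A. ■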
Proof of Theorem Our method is based on a consistency proof for miminum contrast 
estimators by Dacunha-Castelle and Duflo (1993, p. 94-96). Let us consider a countable 
dense set D in 0, then infgge 5„(0) = infeg^j Sn{9), is a measurable random variable. We 
define in addition the random variable 

Win,0=sup{\Sn{e)-Sn{9')\; {9,9') G \\9-9'\\<(}, 

and recall that ^(^o) = 0. Let us consider a non-empty open ball Bq centered on 6*0 such 
that S is bounded from below by a positive real number 2e on Q\Bq. Let us consider us 
consider a sequence {^,p)p>i decreasing to zero, and take p such that there exists a covering 
of Q\Bq by a finite number £ of balls (-Bj)i<i<£ with centers G G, i = 1, . . . , £, and radius 
less than ^p. Then, for all 9 G Bi, we have 

Sn{e) > Sn{9i)-\Sn{9)-Sn{9^)\ 

> Sn{9i) - sup \Sn{9) - Sn{9i)\, 

which leads to 



inf Sn{9)> inf Sn{9i)-W{n,ip). 
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As a consequence we have the fohowing events inclusions 
lOn^Bo} C I inf Sn{9) < Sn{9o)] 

I J [eee\Bo J 



C <! ^mf ^ Sn{e^) - W{n, Q < SniOo] 



C {W{n,Cp) >e}u{ inf - 5„(0o)) < e 

l<i<t 



Thus we have 

{^n ^ So} C {Ty(n,Cp) >e}U j^inf - 5„(eo)) < • (13) 

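By the convergence given in Proposition 2 we have

$$
\begin{aligned}
P\Big(\inf_{1\le i\le \ell}\big(S_n(\theta_i)-S_n(\theta_0)\big)\le\varepsilon\Big)
&\le 1-\prod_{i=1}^{\ell}\Big(1-P\big(S_n(\theta_i)-S_n(\theta_0)\le\varepsilon\big)\Big)\\
&\le 1-\prod_{i=1}^{\ell}\Big(1-P\big(S_n(\theta_i)-S(\theta_i)-(S_n(\theta_0)-S(\theta_0))\le\varepsilon-(S(\theta_i)-S(\theta_0))\big)\Big)\\
&\le 1-\prod_{i=1}^{\ell}\Big(1-P\big(S_n(\theta_i)-S(\theta_i)-(S_n(\theta_0)-S(\theta_0))\le-\varepsilon\big)\Big)\\
&\le 1-\prod_{i=1}^{\ell}\Big(1-P\big(|S_n(\theta_i)-S(\theta_i)|+|S_n(\theta_0)-S(\theta_0)|\ge\varepsilon\big)\Big)\\
&\le 1-\prod_{i=1}^{\ell}\Big(1-\big[P\big(|S_n(\theta_i)-S(\theta_i)|\ge\varepsilon/2\big)+P\big(|S_n(\theta_0)-S(\theta_0)|\ge\varepsilon/2\big)\big]\Big),
\end{aligned}
$$

where the third line uses $S(\theta_i)-S(\theta_0)\ge 2\varepsilon$, and the last term on the right-hand side of the above inequality tends to zero according
to Proposition 2. Because $S_n$ is Lipschitz over $\Theta$ by Lemma 3, we have that for sufficiently
large $p$, $|S_n(\theta)-S_n(\theta')|\le\varepsilon/2$ for all $(\theta,\theta')$ such that $\|\theta-\theta'\|_2\le\zeta_p$, thus $P(W(n,\zeta_p)>\varepsilon)=0$.
We just proved the consistency in probability of the contrast estimator $\hat\theta_n$. ■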
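The argument above rests on the empirical contrast being close to zero at $\theta_0$ and bounded away from zero elsewhere. The following is a minimal numerical sketch of that behaviour, not the authors' implementation. It assumes (none of this is defined in the present section) that $M(\theta,u)=p\,e^{iu\alpha}+(1-p)\,e^{iu\beta}$, that the contrast is the U-statistic $\frac{1}{n(n-1)}\sum_{k\neq j}\int \mathrm{Im}\big(e^{iuX_k}/M(\theta,u)\big)\,\mathrm{Im}\big(e^{iuX_j}/M(\theta,u)\big)\,w(u)\,du$, that the mixed density is standard normal, and that the weight is $w(u)=e^{-u^2/2}$ on $[-3,3]$. The names `mixing_cf` and `empirical_contrast` are hypothetical.

```python
import numpy as np

def mixing_cf(theta, u):
    # Assumed form of M(theta, u): p*exp(i*u*alpha) + (1-p)*exp(i*u*beta)
    p, alpha, beta = theta
    return p * np.exp(1j * u * alpha) + (1.0 - p) * np.exp(1j * u * beta)

def empirical_contrast(theta, x, u, w, chunk=2000):
    # U-statistic estimate of int Im(g*(u)/M(theta,u))^2 w(u) du (assumed contrast)
    n = x.size
    du = u[1] - u[0]
    m = mixing_cf(theta, u)
    s1 = np.zeros(u.size)  # sum_k a_k(u)
    s2 = np.zeros(u.size)  # sum_k a_k(u)^2
    for start in range(0, n, chunk):  # process the sample in chunks to limit memory
        a = np.imag(np.exp(1j * np.outer(x[start:start + chunk], u)) / m)
        s1 += a.sum(axis=0)
        s2 += (a ** 2).sum(axis=0)
    cross = s1 ** 2 - s2  # equals sum over k != j of a_k(u) * a_j(u)
    return float(np.sum(cross * w) * du / (n * (n - 1)))

rng = np.random.default_rng(0)
n = 20000
theta0 = (0.3, -1.0, 2.0)  # illustrative (p, alpha, beta), not from the paper
from_first = rng.random(n) < theta0[0]
x = rng.standard_normal(n) + np.where(from_first, theta0[1], theta0[2])
u = np.linspace(-3.0, 3.0, 121)
w = np.exp(-0.5 * u ** 2)  # assumed weight density

s_true = empirical_contrast(theta0, x, u, w)
s_wrong = empirical_contrast((0.3, 0.0, 2.0), x, u, w)
```

Under these assumptions `s_true` should be close to zero while `s_wrong` is visibly larger, which is the separation the covering argument exploits.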

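Proof of Theorem. By a Taylor expansion of $\dot S_n$ around $\theta_0$ we have

$$0=\dot S_n(\hat\theta_n)=\dot S_n(\theta_0)+\ddot S_n(\theta_n^*)(\hat\theta_n-\theta_0),\qquad(14)$$

where $\theta_n^*$ lies in the line segment with extremities $\hat\theta_n$ and $\theta_0$.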
Step 1. Let us prove that

$$\dot S_n(\theta_0)=-\frac{1}{2n(n-1)}\sum_{k\neq j}\int_{|u|\le 1/h}\dot Z_k(\theta_0,u)Z_j(\theta_0,u)\,dW(u)\qquad(15)$$

is asymptotically normal: $\sqrt n\,\dot S_n(\theta_0)\to N(0,V)$ in distribution.
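Indeed, $S(\theta_0)=0$ and $J(\theta_0,u)=0$ for all $u\in\mathbb R$ imply that

$$E[\dot S_n(\theta_0)]=-\frac12\int_{|u|\le 1/h}\dot J(\theta_0,u)J(\theta_0,u)\,dW(u)=0.$$

Therefore we decompose

$$
\begin{aligned}
\dot S_n(\theta_0)={}&-\frac{1}{2n(n-1)}\sum_{k\neq j}\int_{|u|\le 1/h}\big(\dot Z_k(\theta_0,u)-\dot J(\theta_0,u)\big)Z_j(\theta_0,u)\,dW(u)\\
&-\frac{1}{2n}\sum_{k=1}^{n}\int_{|u|\le 1/h}Z_k(\theta_0,u)\dot J(\theta_0,u)\,dW(u)\ =:\ A_n+B_n.
\end{aligned}
$$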
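The split into $A_n$ and $B_n$ is plain algebra on the double sum. As a sketch, suppressing the arguments $(\theta_0,u)$ and the integration against $dW(u)$, assuming that the dotted quantities denote gradients in $\theta$, and using only that $\dot J$ does not depend on the summation indices:

```latex
\sum_{k\neq j}\dot Z_k Z_j
  = \sum_{k\neq j}(\dot Z_k-\dot J)\,Z_j + \dot J\sum_{k\neq j} Z_j
  = \sum_{k\neq j}(\dot Z_k-\dot J)\,Z_j + (n-1)\,\dot J\sum_{j=1}^{n} Z_j ,
\\[4pt]
-\frac{1}{2n(n-1)}\sum_{k\neq j}\dot Z_k Z_j
  = -\frac{1}{2n(n-1)}\sum_{k\neq j}(\dot Z_k-\dot J)\,Z_j
    \;-\;\frac{1}{2n}\sum_{j=1}^{n} Z_j\,\dot J .
```

The first term is centered in both indices (it is the integrand of $A_n$), while the second is a plain average of i.i.d. terms (the integrand of $B_n$).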

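We shall see that $\sqrt n\,B_n$ gives the dominant behaviour in the limit in distribution.
Indeed,

$$
\begin{aligned}
\|n\,\mathrm{Var}(A_n)\| &\le \frac{1}{4(n-1)}\Big\|E\Big[\int_{|u|\le 1/h}\dot Z_1(\theta_0,u)Z_2(\theta_0,u)\,dW(u)\cdot\int_{|u|\le 1/h}\dot Z_1^{\top}(\theta_0,u)Z_2(\theta_0,u)\,dW(u)\Big]\Big\|\\
&\le \frac{C}{(1-2P)^6\,n}\Big(\int(1+|u|)\,dW(u)\Big)^2=o(1).
\end{aligned}
$$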

The asymptotic behaviour of the distribution of $\sqrt n\,B_n$ is obtained by noticing that
$\sqrt n\,B_n=C_n+D_n$, where

$$C_n=-\frac{1}{2\sqrt n}\sum_{k=1}^{n}U_k(\theta_0),\qquad D_n=-\frac{1}{2\sqrt n}\sum_{k=1}^{n}\big(U_{k,n}(\theta_0)-U_k(\theta_0)\big),$$

and

$$U_{k,n}(\theta_0)=\int_{|u|\le 1/h}Z_k(\theta_0,u)\dot J(\theta_0,u)\,dW(u)$$

is a centered variable which depends on $n$ via $h$, while

$$U_k(\theta_0)=\int Z_k(\theta_0,u)\dot J(\theta_0,u)\,dW(u)$$

is a centered variable not dependent on $n$. Note that $D_n=o_P(1)$ as
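$$
\begin{aligned}
\Big\|\mathrm{Var}\Big(\frac{1}{\sqrt n}\sum_{k=1}^{n}\big(U_{k,n}(\theta_0)-U_k(\theta_0)\big)\Big)\Big\|
&\le \Big\|E\Big[\int_{|u|>1/h}Z_1(\theta_0,u)\dot J(\theta_0,u)\,dW(u)\cdot\int_{|u|>1/h}Z_1(\theta_0,u)\dot J(\theta_0,u)^{\top}dW(u)\Big]\Big\|\\
&\le \frac{o(1)}{(1-2P)^4},
\end{aligned}
$$

as $h\to 0$, since each integral over $\{|u|>1/h\}$ appearing in the bound tends to 0 when $h\to 0$. In a standard way,
$C_n$ satisfies the following central limit theorem:

$$C_n=-\frac{1}{2\sqrt n}\sum_{k=1}^{n}U_k(\theta_0)\ \to\ N(0,V)\quad\text{in distribution, as } n\to\infty,\qquad(16)$$

where $V$ denotes the covariance matrix of $-U_1(\theta_0)/2$, which is equal to $\frac14 E\big(U_1(\theta_0)U_1(\theta_0)^{\top}\big)$
(and cannot be made explicit due to the integral nature of the terms).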
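Step 2. Let us prove that

$$\ddot S_n(\theta_n^*)\ \to\ \mathcal I\quad\text{in probability},\qquad(17)$$

where $\mathcal I=\mathcal I(\theta_0)=-\frac12\int\dot J(\theta_0,u)\dot J^{\top}(\theta_0,u)\,dW(u)$.
We start by writing the triangular inequality

$$\|\ddot S_n(\theta_n^*)-\mathcal I\|\le\|\ddot S_n(\theta_n^*)-\ddot S_n(\theta_0)\|+\|\ddot S_n(\theta_0)-\mathcal I\|.$$

Then we use the Lipschitz property of $\ddot S_n$, Lemma 2, and the convergence in probability
of $\hat\theta_n$ to $\theta_0$. Finally, we compute the limit of $\ddot S_n(\theta_0)$. Indeed

$$
\begin{aligned}
E(\ddot S_n(\theta_0))&=-\frac12\int_{|u|\le 1/h}\big(\ddot J(\theta_0,u)J(\theta_0,u)+\dot J(\theta_0,u)\dot J(\theta_0,u)^{\top}\big)\,dW(u)\\
&=-\frac12\int_{|u|\le 1/h}\dot J(\theta_0,u)\dot J(\theta_0,u)^{\top}\,dW(u),
\end{aligned}
$$

as $J(\theta_0,u)=0$. We see that $E(\ddot S_n(\theta_0))\to\mathcal I(\theta_0)$, as $h\to 0$. ■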
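Steps 1 and 2 combine through the expansion (14) in the usual way. The following is a sketch of that last step, under the additional assumption (not restated in this proof) that the limiting matrix $\mathcal I$ of Step 2 is invertible. The sandwich form of the limiting covariance is the standard consequence of Slutsky's lemma and should be checked against the statement of the theorem.

```latex
0 = \dot S_n(\theta_0) + \ddot S_n(\theta_n^*)(\hat\theta_n-\theta_0)
\;\Longrightarrow\;
\sqrt n\,(\hat\theta_n-\theta_0) = -\,\ddot S_n(\theta_n^*)^{-1}\,\sqrt n\,\dot S_n(\theta_0),
\\[4pt]
\sqrt n\,\dot S_n(\theta_0) \xrightarrow{d} N(0,V), \qquad
\ddot S_n(\theta_n^*) \xrightarrow{P} \mathcal I
\;\Longrightarrow\;
\sqrt n\,(\hat\theta_n-\theta_0) \xrightarrow{d} N\bigl(0,\ \mathcal I^{-1} V\, \mathcal I^{-1}\bigr).
```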
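Proof of the Theorem.
Note first that

$$
E(\hat f_n(x))=E\Big[\frac{1}{2\pi}\int e^{-iux}K^*(b_nu)\,\frac1n\sum_{k=1}^{n}\frac{e^{iuX_k}}{M(\hat\theta_{n,-k},u)}\,du\Big]
=\frac{1}{2\pi}\int e^{-iux}g^*(u)K^*(b_nu)\,E\Big[\frac{1}{M(\hat\theta_{n,-1},u)}\Big]\,du.
$$

Recall that $\inf_{\theta\in\Theta}|M(\theta,u)|\ge 1-2P$, which means that $E\big(|M^{-1}(\hat\theta_{n,-1},u)|\big)\le(1-2P)^{-1}$.
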

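Let us write the usual bias-variance decomposition. For the bias, we have

$$
\begin{aligned}
E(\hat f_n(x))-f(x)&=\frac{1}{2\pi}\int e^{-iux}g^*(u)K^*(b_nu)\,E\Big[\frac{1}{M(\hat\theta_{n,-1},u)}\Big]\,du-\frac{1}{2\pi}\int e^{-iux}\frac{g^*(u)}{M(\theta_0,u)}\,du\\
&=\frac{1}{2\pi}\int e^{-iux}g^*(u)K^*(b_nu)\Big(E\Big[\frac{1}{M(\hat\theta_{n,-1},u)}\Big]-\frac{1}{M(\theta_0,u)}\Big)\,du\\
&\quad+\frac{1}{2\pi}\int e^{-iux}\frac{g^*(u)}{M(\theta_0,u)}\big(K^*(b_nu)-1\big)\,du.
\end{aligned}
$$

Next, we use the facts that $\sup_u|K^*(u)|\le 1$ and that the support of $K^*(b_nu)$ is included
in $\{u:|u|\le 1/b_n\}$ and get

$$
\begin{aligned}
|E(\hat f_n(x))-f(x)|&\le\frac{1}{2\pi}\Big(\int|g^*(u)|\,\big|E\big(M^{-1}(\hat\theta_{n,-1},u)\big)-M^{-1}(\theta_0,u)\big|\,du+\frac{1}{1-2P}\int_{|u|>1/b_n}|g^*(u)|\,du\Big)\\
&\le O\Big(\frac{1}{\sqrt n}\Big)+o(1)\,\frac{b_n^{\beta-1/2}}{1-2P}.
\end{aligned}
$$
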

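For the variance, we write

$$
\begin{aligned}
\mathrm{Var}(\hat f_n(x))&=E\Big[\Big(\frac{1}{2\pi n}\sum_{k=1}^{n}\int e^{-iux}K^*(b_nu)\Big(\frac{e^{iuX_k}}{M(\hat\theta_{n,-k},u)}-g^*(u)E\Big[\frac{1}{M(\hat\theta_{n,-1},u)}\Big]\Big)\,du\Big)^2\Big]\\
&\le\frac{1}{4\pi^2n}\,E\Big[E\Big[\Big|\int e^{-iux}K^*(b_nu)\frac{e^{iuX_1}}{M(\hat\theta_{n,-1},u)}\,du\Big|^2\ \Big|\ X_2,\dots,X_n\Big]\Big]\\
&\le\frac{C(K,f)}{4\pi^2(1-2P)^2\,n\,b_n},
\end{aligned}
$$

where $C(K,f)$ stands for a product of norms of $K$ and $f$.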
Therefore, for $b_n$ chosen as in the statement of our theorem, we get the announced upper bounds. ■
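To make the object of this proof concrete, here is a minimal sketch of a Fourier-inversion density estimate of the kind analysed above. It is an illustration under simplifying assumptions, not the estimator of the paper: the true parameter $\theta_0$ is plugged in instead of the leave-one-out estimates $\hat\theta_{n,-k}$, the kernel is taken with $K^*(t)=\mathbf 1\{|t|\le 1\}$, $M(\theta,u)=p\,e^{iu\alpha}+(1-p)\,e^{iu\beta}$ is assumed, the mixed density is standard normal, and the bandwidth $b=0.3$ is arbitrary. The function names are hypothetical.

```python
import numpy as np

def mixing_cf(theta, u):
    # Assumed form of M(theta, u): p*exp(i*u*alpha) + (1-p)*exp(i*u*beta)
    p, alpha, beta = theta
    return p * np.exp(1j * u * alpha) + (1.0 - p) * np.exp(1j * u * beta)

def fourier_density_estimate(x_eval, sample, theta, b, n_grid=201):
    # With K*(t) = 1 on [-1, 1], only frequencies |u| <= 1/b contribute
    u = np.linspace(-1.0 / b, 1.0 / b, n_grid)
    du = u[1] - u[0]
    # Empirical characteristic function of the sample at each frequency
    ecf = np.array([np.mean(np.exp(1j * uu * sample)) for uu in u])
    integrand = np.exp(-1j * u * x_eval) * ecf / mixing_cf(theta, u)
    # Riemann-sum approximation of (1 / 2 pi) * integral; keep the real part
    return float(np.real(np.sum(integrand)) * du / (2.0 * np.pi))

rng = np.random.default_rng(1)
n = 20000
theta0 = (0.3, -1.0, 2.0)  # illustrative (p, alpha, beta), not from the paper
from_first = rng.random(n) < theta0[0]
sample = rng.standard_normal(n) + np.where(from_first, theta0[1], theta0[2])

f_hat_0 = fourier_density_estimate(0.0, sample, theta0, b=0.3)
```

In this setting `f_hat_0` should be close to the standard normal density at zero, $1/\sqrt{2\pi}$. Shrinking `b` widens the frequency window, which lowers the truncation bias and raises the variance, the trade-off that the bias and variance bounds above quantify.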